Abstract. The burst search in LIGO relies on the coincident detection of transient 
signals in multiple interferometers. As only minimal assumptions are made about 
the event waveform or duration, the analysis pipeline requires loose coincidence in 
time, frequency and amplitude. Confidence in the resulting events and their waveform 
consistency is established through a time-domain coherent analysis: the r-statistic test. 
This paper presents a performance study of the r-statistic test for triple coincidence 
events in the second LIGO Science Run (S2), with emphasis on its ability to suppress 
the background false rate and its efficiency at detecting simulated bursts of different 
waveforms close to the S2 sensitivity curve. 

1. Introduction 

The Laser Interferometer Gravitational-wave Observatory (LIGO) consists of three 
detectors: H1 and H2, co-located in Hanford, WA, and L1, located in Livingston, LA. The 
simultaneous availability of interferometric data from detectors with similar sensitivity 
and orientation allows a coherent coincidence analysis to be implemented in the search 
for bursts of gravitational waves. 

In the LIGO burst analysis pipeline [1, 2, 3], candidate events are identified as 
excesses of power or amplitude in the data stream of each interferometer by a suite 
of search algorithms, referred to as Event Trigger Generators (ETGs): BlockNormal [4], 
Excess Power [5], TFClusters [6], and WaveBurst [7, 8]. The ETG tuning is tailored to 
maximize the detection efficiency for a variety of waveforms (narrow-band, broad-band 
and astrophysically motivated), with a single interferometer false rate of the order of 
1 Hz. This relatively large trigger rate is suppressed by the multi-interferometer analysis, 
which currently only requires that events be coincident in time and in frequency. The 
coincidence parameters (time window and frequency tolerance) are tuned according 
to the principle that the coincident detection efficiency should equal the product of 
efficiencies in the individual interferometers. Coincidence criteria should be loose enough 
not to further reduce the detection efficiency, within the limitations imposed by the 
false alarm rate. The coincidence analysis eventually outputs triggers (start time, 
duration) when excesses of power or amplitude have been detected simultaneously in all 
interferometers. The first step toward validation of such events is a comparison of the 
waveforms as they appear in each detector. 

This paper describes a test that exploits cross-correlation between pairs of 
interferometers and combines them into a multi-interferometer correlation confidence. 
The test is a powerful tool for the suppression of accidental coincidences without 
reducing the detection efficiency of the pipeline. Its performance has been tested on 
a 10% portion of data from the LIGO second science run (S2). 

2. The r-statistic Cross Correlation Test 

2.1. r-statistic 

The fundamental building block for the waveform consistency test is the r-statistic, or 
the linear correlation coefficient of two sequences {x_i} and {y_i}: 

$$ r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}} . \qquad (1) $$

This quantity only assumes values between -1 (fully anti-correlated sequences), 0 
(uncorrelated sequences) and +1 (fully correlated sequences). More generally, if the two 
sequences are uncorrelated, we expect the r-statistic to follow a normal distribution 
with zero mean and σ = 1/√N, where N is the number of data points used to compute 
r. A coherent component in the two sequences will cause r to deviate from the normal 
distribution. If we think of data sequences as vectors in an N-dimensional space, the 
r-statistic can be seen as an estimator for the cosine of the angle between the two vectors 
(see Figure 1). As a normalized statistic, it is not sensitive to the relative amplitude of 
the two vectors; the advantage is robustness against fluctuations of detector response 
and noise floor. 

Figure 1. Graphical representation of the r-statistic as the cosine of the angle between 
vectors in an N-dimensional space. (a) The two sequences are uncorrelated: the vectors 
are orthogonal and r = 0. (b) The two sequences are correlated: as the coherent 
component dominates over the incoherent noise, r → 1. 

The number of points or, alternatively, the integration window τ, is the most 
important parameter in the construction of the r-statistic. Its optimal value depends 
in general on the signal. If τ is too large, the signal is "washed out" in the computation 
of r; if it is too small, statistical considerations on the distribution of r lose validity. 
Simulation studies show that a set of three integration times (20, 50 and 100 ms) is 
suitable for most short signals of interest to the LIGO burst search. A more detailed 
(and computationally intense) scan of integration windows can be implemented in a 
targeted analysis, as was done for the LIGO externally triggered search described 
in [9, 10], which is also based on cross-correlation. 
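
As a minimal sketch of eq. 1, using only the Python standard library, the code below computes the linear correlation coefficient and checks empirically that, for pairs of independent Gaussian noise sequences, r scatters around zero with a standard deviation close to 1/√N. The function names, the number of points and the number of trials are illustrative choices, not part of the LIGO pipeline.

```python
import math
import random

def r_statistic(x, y):
    """Linear correlation coefficient of two equal-length sequences (eq. 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def null_spread(n_points, n_trials, seed=0):
    """Standard deviation of r over many pairs of independent noise sequences."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        values.append(r_statistic(x, y))
    mean_r = sum(values) / n_trials
    return math.sqrt(sum((v - mean_r) ** 2 for v in values) / n_trials)

n_points = 200                      # illustrative window length, in samples
spread = null_spread(n_points, n_trials=1000)
expected = 1.0 / math.sqrt(n_points)
print(f"measured sigma = {spread:.4f}, expected 1/sqrt(N) = {expected:.4f}")
```

Because r is normalized by both variances, rescaling either sequence by a positive factor leaves it unchanged, which is the amplitude robustness mentioned above.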

2.2. Data Conditioning 

The r-statistic test is especially effective when all coherent lines and known spectral 
features are removed from the raw strain data; for this reason, data conditioning plays 
an important role in the test. 

Raw data from each interferometer is band-passed and decimated, in order to 
suppress the contribution of seismic noise and instrumental artifacts at low and high 
frequencies and restrict the coherent analysis to the most sensitive frequency band in 
the LIGO interferometers. For the performance studies reported in this paper, the band 
of interest was 100-2048 Hz. 

Figure 2. Schematic representation of how an event trigger is scanned in the r-statistic 
test: for each value of the integration window τ, the trigger is partitioned into 
N_j intervals of width τ, with 50% overlap. 

Next, the data is effectively whitened by a linear error predictor filter trained 
on a 10 s period before the event start time. This filter, described in more detail 
in [11, 12], removes predictable content, such as lines and the spectral shape, and 
emphasizes transients. 
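
The whitening step can be illustrated with a small sketch. The filter actually used is the one described in [11, 12]; the code below is only a stand-in under stated assumptions: it fits a linear predictor with the Levinson-Durbin recursion on a training segment and keeps the prediction error on the segment that follows. The filter order, the toy AR(1) "colored noise" and all function names are hypothetical choices made for illustration.

```python
import random

def autocorr(x, max_lag):
    """Biased sample autocovariance for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    return [sum(xc[i] * xc[i + lag] for i in range(n - lag)) / n
            for lag in range(max_lag + 1)]

def train_predictor(train, order):
    """Levinson-Durbin: coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k]."""
    acf = autocorr(train, order)
    a = [0.0] * (order + 1)
    err = acf[0]
    for m in range(1, order + 1):
        refl = (acf[m] - sum(a[k] * acf[m - k] for k in range(1, m))) / err
        prev = a[:]
        a[m] = refl
        for k in range(1, m):
            a[k] = prev[k] - refl * prev[m - k]
        err *= 1.0 - refl * refl
    return a[1:]

def whiten(x, coeffs):
    """Prediction error e[n] = x[n] - sum_k a[k] * x[n-k]."""
    p = len(coeffs)
    return [x[n] - sum(coeffs[k] * x[n - 1 - k] for k in range(p))
            for n in range(p, len(x))]

def lag1_correlation(x):
    """Normalized autocorrelation at lag 1 (close to 0 for white data)."""
    acf = autocorr(x, 1)
    return acf[1] / acf[0]

# Toy colored noise: an AR(1) process standing in for un-whitened detector data.
rng = random.Random(1)
data = [0.0]
for _ in range(6000):
    data.append(0.9 * data[-1] + rng.gauss(0.0, 1.0))

train, event = data[1000:5000], data[5000:]   # train before the "event", apply after
coeffs = train_predictor(train, order=4)      # order 4 is an arbitrary choice
residual = whiten(event, coeffs)
print("lag-1 correlation before:", lag1_correlation(event))
print("lag-1 correlation after: ", lag1_correlation(residual))
```

Training on data that precede the event, as in the text, keeps a transient in the event segment from being absorbed into the predictor.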



2.3. Trigger Scan 

The next step consists of partitioning the trigger duration into intervals equal to the 
integration window, with 50% overlap, as shown in Figure 2. For each pair of 
interferometers l, m in {L1, H1, H2}, a selected value of the integration window p 
in {20 ms, 50 ms, 100 ms} and each interval j, data is selected from the conditioned time 
series of the two interferometers. One of the two sequences is time-shifted with respect to 
the other, yielding a distribution of r coefficients $r^{k}_{plmj}$, 
where the index k represents the time lag between the two series, in steps equal to the 
inverse of the sampling rate, covering the whole ±10 ms range to account for the light 
travel time between LIGO sites. The quantity $r^{k}_{plmj}$ assumes values in [-1, 1] but, as the 
test is mostly interested in how much the correlation coefficient deviates from 0, only 
its absolute value $|r^{k}_{plmj}|$ is used. 

A Kolmogorov-Smirnov test with 5% significance is used to compare the {$|r^{k}_{plmj}|$} 
distribution to the null hypothesis expectation of a normal distribution with zero mean 
and σ = 1/√N_p, where N_p is the number of data samples in the p-th integration window. 
If the two are inconsistent, the next step is to compute the one-sided significance and 
the corresponding confidence $C^{k}_{plmj}$ for each time lag. 

A correlation confidence is assigned to interval j for interferometers l, m and integration 
window p as the maximum confidence over all time lags: 

$$ \Gamma_{plmj} = \max_k C^{k}_{plmj} . \qquad (4) $$

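The degeneracy over interferometer pairs is resolved through the arithmetic average of 
confidences from all pair combinations of the N_ifo interferometers: 

$$ \Gamma_{pj} = \frac{1}{N_{pairs}} \sum_{l<m} \Gamma_{plmj} , \qquad N_{pairs} = \frac{N_{ifo}(N_{ifo}-1)}{2} . \qquad (5) $$

Finally, Γ, the combined correlation confidence for the event, is obtained by maximizing 
over all time intervals and integration windows: 

$$ \Gamma = \max_{p,j} \Gamma_{pj} . \qquad (6) $$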

The event passes the waveform consistency test if: 

$$ \Gamma > \beta , \qquad (7) $$

where β is the threshold imposed on the multi-interferometer correlation confidence. 
For additional details on the method and its implementation, see [12, 13]. 
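
The scan described in this section can be sketched in a few lines of Python. This is not the implementation of [12, 13]: the Kolmogorov-Smirnov pre-test is omitted, the function names and the toy data are hypothetical, and the significance is an assumed form, S = erfc(|r| √(N_p/2)) with confidence C = -log10(S). That form is what a Gaussian null with σ = 1/√N_p implies, and it makes a threshold of 3 correspond to a 10^-3 false probability on a single pair and interval. The low sampling rate in the demonstration only keeps the pure-Python example fast.

```python
import math
import random

def r_statistic(x, y):
    """Linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def confidence(r, n):
    """ASSUMED significance -> confidence map for a Gaussian null, sigma = 1/sqrt(n)."""
    significance = math.erfc(abs(r) * math.sqrt(n / 2.0))
    return -math.log10(max(significance, 1e-300))   # floor avoids log10(0)

def combined_confidence(series, fs, windows=(0.020, 0.050, 0.100), max_lag=0.010):
    """Gamma for one trigger; `series` maps interferometer name -> conditioned samples."""
    names = sorted(series)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    length = min(len(series[name]) for name in names)
    k_max = int(round(max_lag * fs))                 # largest time lag, in samples
    gamma = 0.0
    for window in windows:
        n = int(round(window * fs))                  # samples per integration window
        step = max(n // 2, 1)                        # 50% overlap between intervals
        for start in range(k_max, length - n - k_max + 1, step):
            total = 0.0
            for a, b in pairs:
                x = series[a][start:start + n]
                total += max(                        # maximum over time lags (eq. 4)
                    confidence(r_statistic(x, series[b][start + k:start + k + n]), n)
                    for k in range(-k_max, k_max + 1))
            # average over pairs, then running maximum over intervals and windows
            gamma = max(gamma, total / len(pairs))
    return gamma

def toy_trigger(amplitude, fs, duration=0.3, seed=0):
    """Same toy burst in three detectors plus independent unit-variance noise."""
    rng = random.Random(seed)
    f0, tau, t0 = 235.0, 0.0086, duration / 2        # roughly a Q=9 sine-Gaussian
    burst = [amplitude * math.sin(2 * math.pi * f0 * (i / fs - t0))
             * math.exp(-((i / fs - t0) / tau) ** 2) for i in range(int(duration * fs))]
    return {name: [b + rng.gauss(0.0, 1.0) for b in burst]
            for name in ("H1", "H2", "L1")}

fs = 1024.0                                          # toy sampling rate (assumption)
gamma_noise = combined_confidence(toy_trigger(0.0, fs), fs)
gamma_burst = combined_confidence(toy_trigger(10.0, fs), fs)
beta = 3.0
print(f"noise only: {gamma_noise:.2f}, with burst: {gamma_burst:.2f}, "
      f"burst passes: {gamma_burst > beta}")
```

Taking the maximum over lags before averaging over pairs lets each pair pick its own best time offset, as in eq. 4.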

3. Triple Coincidence Performance Analysis in S2 

The r-statistic test performance has been explored, independently of previous portions 
of the burst analysis pipeline, by adding simulated waveforms to real interferometer 
noise and then passing 200 ms of data around the simulated peak time through the 
r-statistic test. 

For convenience, we define here two quantities that will be used to characterize a 
burst signal. h_rss is the square root of the total burst energy, in units of strain/√Hz: 

$$ h_{rss} = \sqrt{\int_{-\infty}^{\infty} |h(t)|^2 \, dt} = \sqrt{\int_{-\infty}^{\infty} |\tilde{h}(f)|^2 \, df} . \qquad (8) $$

This quantity can be compared directly to the sensitivity curves, as is also reflected in 
this definition of signal-to-noise ratio: 

$$ SNR = \sqrt{2 \int_0^{\infty} \frac{|\tilde{h}(f)|^2}{S_h(f)} \, df} , \qquad (9) $$

where S_h(f) is the single-sided detector noise. In particular, for narrow-band bursts 
with central frequency f_0, this "excess-power" definition of SNR becomes the ratio of 
h_rss to the detector sensitivity at frequency f_0: SNR ≈ h_rss/√S_h(f_0). In the following, 
S_h(f) will be the single-sided reference noise for the S2 run. 
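
The narrow-band limit quoted above follows in one step, assuming that eq. 9 is normalized as SNR² = 2∫₀^∞ |h̃(f)|²/S_h(f) df with single-sided S_h(f) (the normalization for which the stated limit holds exactly), that h(t) is real, and that S_h(f) varies slowly across the signal bandwidth:

```latex
\begin{aligned}
\mathrm{SNR}^2
  &= 2 \int_0^{\infty} \frac{|\tilde{h}(f)|^2}{S_h(f)} \, df
   \approx \frac{2}{S_h(f_0)} \int_0^{\infty} |\tilde{h}(f)|^2 \, df \\
  &= \frac{1}{S_h(f_0)} \int_{-\infty}^{\infty} |\tilde{h}(f)|^2 \, df
   = \frac{h_{rss}^2}{S_h(f_0)} .
\end{aligned}
```

The second line uses |h̃(-f)| = |h̃(f)| for real h(t). As a worked number, the values quoted in the caption of Figure 3 (h_rss = 2.6 × 10^-21 strain/√Hz at SNR = 3.3 in H2) correspond to √S_h(235 Hz) ≈ 2.6 × 10^-21 / 3.3 ≈ 7.9 × 10^-22 strain/√Hz for H2.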

The LIGO burst search has adopted sine-Gaussians as standard narrow-band 
waveforms to test the search algorithms: 

$$ h(t) = h_{peak} \sin(2\pi f_0 (t - t_0)) \, e^{-(t - t_0)^2/\tau^2} ; \qquad h_{rss} \approx h_{peak} \sqrt{\frac{Q}{4\sqrt{\pi} f_0}} , \qquad (10) $$

where Q = √2 π τ f_0 is the number of cycles folded under the Gaussian envelope. As 
instances of broad-band, limited-duration bursts, one can also consider Gaussians of the 
form: 

$$ h(t) = h_{peak} \, e^{-(t - t_0)^2/\tau^2} ; \qquad h_{rss} = h_{peak} \left( \frac{\pi \tau^2}{2} \right)^{1/4} . \qquad (11) $$

Linearly polarized signals of both types and various amplitudes have been simulated 
on top of actual interferometer noise in a 10% portion of the S2 run, referred to as the 
playground. In all cases, the same amplitude has simultaneously been injected in all 
three LIGO detectors, with no correction for antenna pattern effects. In other words, 
the results reported here are for optimal orientation; this is useful to test the sensitivity 
of the algorithm to small signals, close to the noise floor. 
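
The two waveform families and their h_rss can be checked numerically. The sketch below samples a Q=9, 235 Hz sine-Gaussian and a 1.0 ms Gaussian, integrates |h(t)|² on the grid, and compares the result with the closed forms h_peak √(Q/(4√π f_0)) and h_peak (πτ²/2)^(1/4), which follow from integrating the Gaussian envelopes analytically. The peak amplitude, sampling rate and function names are arbitrary illustrative choices, not values from the S2 analysis.

```python
import math

def sine_gaussian(t, h_peak, f0, q, t0=0.0):
    """Sine-Gaussian with Q = sqrt(2) * pi * tau * f0."""
    tau = q / (math.sqrt(2.0) * math.pi * f0)
    return h_peak * math.sin(2.0 * math.pi * f0 * (t - t0)) * math.exp(-((t - t0) / tau) ** 2)

def gaussian(t, h_peak, tau, t0=0.0):
    """Gaussian pulse of width tau."""
    return h_peak * math.exp(-((t - t0) / tau) ** 2)

def hrss(samples, fs):
    """Square root of the time integral of h(t)^2, as a Riemann sum."""
    return math.sqrt(sum(h * h for h in samples) / fs)

fs = 16384.0                        # sampling rate used for the numerical integral
h_peak = 1.0e-20                    # arbitrary peak strain
times = [i / fs - 0.1 for i in range(int(0.2 * fs))]

hrss_sg = hrss([sine_gaussian(t, h_peak, 235.0, 9.0) for t in times], fs)
hrss_ga = hrss([gaussian(t, h_peak, 1.0e-3) for t in times], fs)

hrss_sg_closed = h_peak * math.sqrt(9.0 / (4.0 * math.sqrt(math.pi) * 235.0))
hrss_ga_closed = h_peak * (math.pi * (1.0e-3) ** 2 / 2.0) ** 0.25

print(f"sine-Gaussian: numeric {hrss_sg:.4e}, closed form {hrss_sg_closed:.4e}")
print(f"Gaussian:      numeric {hrss_ga:.4e}, closed form {hrss_ga_closed:.4e}")
```

For a narrow-band injection, dividing h_rss by the detector noise amplitude at f_0 then gives the SNR used throughout this section.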

Table 1. h_rss [strain/√Hz] with 50% detection efficiency for Q=9 sine-Gaussians at 
various frequencies and a 1.0 ms Gaussian pulse. The three columns correspond to 
different values of the β threshold. These values, computed for the S2 playground 
(10% of the run), are affected by ~10% statistical and ~10% systematic errors. 

Table 1 reports the resulting sensitivity of the triple-coincidence r-statistic analysis, 
quoted as the h_rss value that is detected with 50% efficiency for three possible values 
of the β threshold. These values are affected by a ~10% systematic error, due to 
calibration, and a ~10% statistical error. 

Figures 3 and 4 show, for the Q=9 235 Hz sine-Gaussian and the 1.0 ms Gaussian 
pulses, the efficiency curves and the location of the 50% point, relative to the S2 
sensitivity. From the signal-to-noise definition in eq. 9, the triple-coincidence r-statistic 
test with β = 3 results in a 50% false dismissal, or 50% detection efficiency, at SNR=3.3 
in H2 (the least sensitive detector) for narrow-band bursts at 235 Hz or SNR=4.5 for 
1.0 ms Gaussian-like bursts. 

Note that the β = 3 (4, 5) threshold shown here has been selected from first 
principles, as it can be traced back to a 10^-3 (10^-4, 10^-5) false probability in the 
correlation between two interferometers on a single interval of duration equal to an 
integration window. In order to get a more complete picture of the test efficiency for 
different values of the β threshold, Figures 5 and 6 show, for the 235 Hz sine-Gaussian 
and the 1.0 ms Gaussian pulses, the detection probability versus false probability for 
signals between SNR=1 and SNR=10 relative to the least sensitive interferometer (H2). 

The false probability, or the probability that an accidental event passes the r-statistic 
test, was obtained from a sample of 1.7 x 10^ 200 ms events randomly selected in the 
S2 playground. As one can see in Figure 7, this sample is sufficient to estimate the 
false probability only up to β = 3; a fit to an exponential decay was used to extrapolate 
the false rate for the ROC curves to β > 3. The choice of β will ultimately be set by 
requirements on the false alarm rate in the full burst pipeline. 
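
The extrapolation step can be sketched as a straight-line fit to the logarithm of the false probability. The numbers below are synthetic placeholders generated from a known exponential, not the S2 measurements; they only show that the fit recovers the decay constant and how it would be evaluated beyond the last measured point. Function names are hypothetical.

```python
import math

def fit_exponential(betas, probs):
    """Least-squares line through (beta, ln p), so that p ~ exp(a + b * beta)."""
    ys = [math.log(p) for p in probs]
    n = len(betas)
    mean_b = sum(betas) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_b) * (y - mean_y) for x, y in zip(betas, ys))
             / sum((x - mean_b) ** 2 for x in betas))
    intercept = mean_y - slope * mean_b
    return intercept, slope

def false_probability(beta, intercept, slope):
    """Evaluate the fitted exponential at an arbitrary threshold."""
    return math.exp(intercept + slope * beta)

# Synthetic placeholder points in the fitted range (beta > 1.5), NOT measured values.
betas = [1.6, 2.0, 2.4, 2.8, 3.0]
probs = [0.5 * math.exp(-2.0 * b) for b in betas]

intercept, slope = fit_exponential(betas, probs)
extrapolated = false_probability(4.0, intercept, slope)
print(f"fitted slope = {slope:.3f}, extrapolated p(beta=4) = {extrapolated:.3e}")
```

With real counts, points with few or zero surviving events would need to be weighted or excluded before taking the logarithm.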

In all cases considered so far, the sensitivity of the r-statistic is comparable to or 
better than that of the ETGs (see for instance [4, 7]), whose 50% detection efficiency in 
triple coincidence typically is in the SNR=5-10 range. This means the r-statistic test 
with β = 3, 4 has a very small effect on the detection efficiency of the burst analysis 
pipeline. 

It is worth emphasizing that these false alarm probabilities have been computed 
by applying the r-statistic test to random times in the S2 dataset. However, when 
the r-statistic is fully integrated in the burst pipeline it acts on triggers that are pre- 
selected by the ETGs and their coincidence analysis. Such events share a minimal set of 
properties in all three interferometers: at the very least, they are simultaneous excesses 
of power in overlapping frequency bands. It is reasonable to expect a larger portion of 
these events to survive the r-statistic test than what is shown above for random events, 
at fixed β. Nevertheless, a preliminary analysis of ETG background coincident triggers 
in S2 indicates the r-statistic test can effectively suppress the accidental rate by 2-4 
orders of magnitude, depending on the value of β, at negligible cost for the detection 
efficiency. 

4. Conclusion 

The LIGO burst S1 analysis [1] exclusively relied on event trigger generators and 
time/frequency coincidences. The search in the second science run (S2) includes a new 
module of coherent analysis: the r-statistic waveform consistency test. By thresholding 
on Γ, the correlation confidence of coincident events, the test can effectively suppress the 
burst false alarm rate by 2-4 orders of magnitude. Tests of the method, using simulated 
signals on top of real S2 noise, yield 50% triple coincidence detection efficiency for 
narrow-band and broad-band bursts at SNR=3-5 relative to the least sensitive detector. 

Figure 3. Q=9 sine-Gaussian signal with f_0 = 235 Hz. 

(a) Detection efficiency of the r-statistic test as a function of the amplitude of the 
simulated signal. The three curves correspond to three values of the β threshold. 

(b) S2 reference sensitivity curves for the three interferometers. The star's horizontal 
position is the central frequency of the sine-Gaussian (235 Hz); its vertical position is 
the h_rss with 50% survival probability if β = 3 is used. This point corresponds to 
h_rss = 2.6 × 10^-21/√Hz and SNR=9 for L1, SNR=4.5 for H1 and SNR=3.3 for H2. 
Sensitivity curves and h_rss have units of strain/√Hz (scale to the left). The dashed 
curve represents the single-sided spectrum for the corresponding waveform, in units of 
strain/Hz (scale to the right). 

Figure 4. As Figure 3, for Gaussian pulses with τ = 1.0 ms. 

Note that for broad-band signals, the characteristic frequency, the frequency that 
maximizes the SNR integrand $|\tilde{h}(f)|^2 / S_h(f)$, is different for the three interferometers: 
f_char = 162 Hz for L1, f_char = 223 Hz for H1 and f_char = 251 Hz for H2. The 50% 
detection probability with β = 3 is at h_rss = 6.2 × 10^-21/√Hz and SNR=13 for L1, 
SNR=6.3 for H1, SNR=4.5 for H2. 

Figure 5. Sine-Gaussian, Q=9, f_0 = 235 Hz: Receiver Operating Characteristics, or 
detection probability vs false rate curves, parametrized with the β threshold. Each 
curve corresponds to a different signal amplitude: the top legend quotes the h_rss, 
while the bottom legend shows the corresponding SNR in the three interferometers. 




Figure 6. Gaussian, τ = 1.0 ms: Receiver Operating Characteristics, or detection 
probability vs false rate curves, parametrized with the β threshold. Each curve 
corresponds to a different signal amplitude: the top legend quotes the h_rss, while 
the bottom legend shows the corresponding SNR in the three interferometers. 

Figure 7. False rate versus β from a sample of 1.7 x 10^ events randomly selected in 
the S2 playground. The dashed line is a fit with an exponential decay, applied to data 
points with β > 1.5. The fit has been used to extrapolate the false rate for β > 3 in 
the construction of the ROCs in Figures 5 and 6. 



